Systematic Gene Search in the Incyte LifeSeq Database




Comparison of databases 
Principle of EST Assembly
~ 50,000 ESTs per tissue 





Assembly at 0% mismatch
with GAP4 (Staden)
Contigs increasing in
number and length
Iterative assembly with
increasing mismatch 
(1%, 2%, 4%) 



5000-6000 Contigs
~ 25,000 other individual sequences



sequences per tissue



FIG. 2a 
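
The iterative scheme of FIG. 2a (assemble at 0% mismatch, then repeat with a rising mismatch ceiling) can be illustrated with a small toy sketch. It is not GAP4 and does not reproduce its alignment algorithm: it assumes equal-length sequences, uses a simple position-by-position mismatch count, and compares only the first member of each cluster. The function names and the helper `split_contigs` are hypothetical; only the mismatch schedule (0, 1, 2, 2, 4, 4) is taken from the figures.

```python
from typing import List, Sequence, Tuple

# Mismatch ceilings (percent) for successive rounds. The values mirror the
# six GAP4 rounds listed in the figures; everything else here is a toy.
MISMATCH_SCHEDULE = [0, 1, 2, 2, 4, 4]


def percent_mismatch(a: str, b: str) -> float:
    """Percent of differing positions; unequal or empty inputs count as 100%."""
    if len(a) != len(b) or not a:
        return 100.0
    return 100.0 * sum(x != y for x, y in zip(a, b)) / len(a)


def assemble_round(clusters: List[List[str]], max_mismatch: float) -> List[List[str]]:
    """Greedily merge clusters whose first members differ by <= max_mismatch."""
    merged: List[List[str]] = []
    for cluster in clusters:
        for target in merged:
            if percent_mismatch(cluster[0], target[0]) <= max_mismatch:
                target.extend(cluster)
                break
        else:
            # No existing cluster was close enough: keep it separate.
            merged.append(list(cluster))
    return merged


def iterative_assembly(
    seqs: Sequence[str], schedule: Sequence[float] = MISMATCH_SCHEDULE
) -> List[List[str]]:
    """Run one round per schedule entry, feeding each result into the next."""
    clusters = [[s] for s in seqs]
    for max_mismatch in schedule:
        clusters = assemble_round(clusters, max_mismatch)
    return clusters


def split_contigs(clusters: List[List[str]]) -> Tuple[List[List[str]], List[str]]:
    """Separate multi-member clusters (contigs) from individual sequences."""
    contigs = [c for c in clusters if len(c) > 1]
    singles = [c[0] for c in clusters if len(c) == 1]
    return contigs, singles
```

Starting strictly and relaxing the ceiling round by round means that near-identical reads are grouped first, and more divergent reads can only join groups that already exist. This is one plausible reading of why the figures show contigs increasing in number and length across rounds.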










09/6734 



~ 50,000 ESTs 
of a tissue 
(e.g.: uterus tumor) 



GAP4 Assembly 1st Round: 
minimum initial match: 20 
maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 0



GAP4-Database 1: 




Contigs 1 
Individual Sequences 1 



unassembled 
ESTs 




GAP4 Assembly 2nd Round: 
minimum initial match: 20 
maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 1 
GAP4-Database 2:
Contigs 2 

Individual Sequences 2 
GAP4 Assembly 3rd Round:
minimum initial match: 20 
maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 2 



GAP4-Database 3: 





Contigs 3 

Individual Sequences 3 



unassembled 
ESTs 
GAP4 Assembly 4th Round:
minimum initial match: 20 
maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 2 
GAP4 Assembly 5th Round:
minimum initial match: 20 
maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 4 
GAP4 Assembly 6th Round:
minimum initial match: 20 
maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 4 
Assembled database
of a specific tissue 
(e.g.: uterus tumor) 
Assembled database
of a specific tissue 
(e.g.: uterus tumor) 
Read-in as individual
sequences 




Database 
of a specific tissue 
(e.g.: uterus tumor) 




Database of a second 

specific tissue 
(e.g.: normal - uterus) 



GAP4 Assembly 
minimum initial match: 20



maximum number of inserted 
blanks per sequence: 8 
maximum percent mismatch: 4





Tumor tissue- 
specific ESTs 



Non-tissue- 
specific ESTs 
Normal tissue-
specific ESTs 



FIG. 2b-4 
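
The sorting step at the end of FIG. 2b-4 can be sketched as follows, assuming the joint assembly of the two tissue databases has already produced clusters and that each member sequence still carries a label for the database it was read in from. The function name, the result keys, and the `"tumor"` / `"normal"` labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# A cluster is a list of (sequence_id, tissue_label) pairs. The labels
# "tumor" and "normal" are assumptions made for this sketch.
Cluster = List[Tuple[str, str]]


def classify_clusters(clusters: Iterable[Cluster]) -> Dict[str, List[List[str]]]:
    """Sort clusters by the tissue origin of their member sequences."""
    result: Dict[str, List[List[str]]] = {
        "tumor_specific": [],
        "normal_specific": [],
        "non_tissue_specific": [],
    }
    for cluster in clusters:
        if not cluster:
            continue  # nothing to classify
        tissues = {tissue for _, tissue in cluster}
        ids = [seq_id for seq_id, _ in cluster]
        if tissues == {"tumor"}:
            result["tumor_specific"].append(ids)
        elif tissues == {"normal"}:
            result["normal_specific"].append(ids)
        elif tissues == {"tumor", "normal"}:
            result["non_tissue_specific"].append(ids)
        else:
            raise ValueError(f"unexpected tissue labels: {sorted(tissues)}")
    return result
```

A cluster that contains members from both databases is treated as non-tissue-specific; a cluster (or unassembled single sequence) drawn from only one database is treated as specific to that tissue.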










09/67340 



In silico subtraction of gene expression in various tissues 



~ 30,000 consensus sequences
normal tissue 



~ 30,000 consensus sequences
tumor tissue 




Assembly at 4% mismatch 
Specific genes

Genes expressed in both tissues
Specific genes
Candidate genes for tumor
suppressors or tumor activators
Electronic Northern Blot
Fisher's Exact Test
Automatic Lengthening




.ATGTCCTA GCCTCAAGTTATC AG ATGCAA. 
Isolation of genomic BAC and PAC clones
Chromosomal clone localization via FISH
Hybridization signal







Sequencing of clones that are located in regions that have 
chromosomal deletions in prostate and breast cancer leads to 
identification of candidate genes 
Confirmation of candidate genes by screening of
mutations and/or deletions in cancer tissues 



FIG. 5 
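
The "Electronic Northern Blot" and "Fisher's Exact Test" steps of FIG. 5 can be illustrated with a short sketch. It assumes that the electronic Northern yields, for one candidate gene, the number of matching ESTs in a tumor library and in a normal library, and that these counts are compared in a 2x2 table. The figures do not specify the table layout or the sidedness of the test, so the two-sided form and the function name below are assumptions. Only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Assumed layout for this sketch:
      a = ESTs matching the gene in the tumor library
      b = remaining ESTs in the tumor library
      c = ESTs matching the gene in the normal library
      d = remaining ESTs in the normal library
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x matching ESTs in the first row,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
```

For example, `fisher_exact_two_sided(12, 49988, 1, 49999)` would test whether 12 hits among 50,000 tumor ESTs against 1 hit among 50,000 normal ESTs is more lopsided than chance alone would explain; these counts are invented for illustration and do not come from the figures.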



